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Abstract 

Background: Hepatitis C Virus (HCV) infected patients are frequently repeatedly exposed to the virus, but very few 
recombinants between two genotypes have been reported. 

Findings: We describe the discovery of an HCV recombinant using a method developed in a United States clinical 
lab for HCV genotyping that employs sequencing of both 5' and 3' portions of the HCV genome. Over twelve 
months, 133 consecutive isolates were analyzed, and a virus from one patient was found with discordant 5' and 3' 
sequences suggesting it was a genotype 2b/la recombinant. We ruled out a mixed infection and mapped a 
recombination point near the NS2/3 cleavage site. 

Conclusions: This unique HCV recombinant virus described shares some features with other recombinant viruses 
although it is the only reported recombinant of a genotype 2 with a subtype la. This recombinant represents a 
conundrum for current clinical treatment guidelines, including treatment with protease inhibitors. This recombinant 
is also challenging to detect by the most commonly employed methods of genotyping that are directed primarily 
at the 5' structural portion of the HCV genome. 
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Background 

The WHO estimates that 130-170 million people world- 
wide are infected with HCV [1]. Six major genotypes 
(lineages) of HCV have spread throughout the world 
[2,3]. Viral genotype is well recognized as the most sig- 
nificant prognostic factor in terms of response to ther- 
apy, and a characteristic upon which to base the 
antiviral prescription [4]. Therefore, the best methods 
and the most definitive viral target(s) for determining 
the HCV genotype during patient care remains an 
important area of translational research. Infection with 
any genotype can lead to liver cirrhosis and liver cancer 
in a minority of patients. While specific genotypes dom- 
inate in certain regions of the world (for example, geno- 
type 4 in the Middle East), many regions have multiple 
genotypes circulating including Europe and the United 
States. People with repeated use of intravenous needles 
and contaminated blood products in these regions likely 
are exposed to more than one genotype of HCV. Both 
mixed infections and recombinant viruses have been 
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described. These situations are thought to be uncom- 
mon, particularly natural recombination between two 
genotypes. Genetic incompatibilities between the viral 
proteins of different genotypes have been suggested as a 
reason for recombination in HCV to be a rare event [5]. 

Here we describe a case report of a patient who was 
chronically infected over an extended period of time 
with a recombinant HCV strain. Like all other naturally 
occurring inter-genotypic recombinants reported to 
date, this strain has a genotype 2 5' portion of the gen- 
ome encoding the structural region, while the nonstruc- 
tural coding region is from a different genotype. The 
crossover junction was mapped to the NS2/3 region. 
We also compared the sequence to other recombinants, 
but could find limited evidence to support the proposed 
theory that stable RNA hairpin structures can promote 
recombination and bracket the cross over junction. 

Methods 

Clinical Genotyping 

Total nucleic acid was extracted from patient plasma 
using the Roche AmpliPrep® TNAI kit (Roche, India- 
napolis, IN). The nucleic acid was reverse-transcribed 
with random primers, AMV RT, and 0.2 mM 
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deoxynucleotides (Promega, Madison, WI). PGR was 
performed on a Roche LightCycler® 480 with Roche 
LightCycler® 480 SYBR Green I Master mix, primers 
(A and B in Table 1; [6-8]) specific for either the 
5'UTR or NS5B regions (TIB Molbiol, Adelphia, NJ) 
and 2uL of the RT-PGR product. Products of ampUfi- 
cation were sequenced bidirectionally using Big Dye 
(Applied Biosystems) on an Applied Biosystems 3100 
instrument. Sequences were compared to a database 
consisting of 100 reference HGV 5'UTR and 58 refer- 
ence NS5B sequences from the Los Alamos National 
Laboratory [9] using Assign-ATF software (Gonexio 
Genomics, Fremantle, Australia). 

Sequencing of Recombinant 

Seven different fragments (Figure 1, Parts of HGV gen- 
ome 1-7) of the discordant virus identified were 
sequenced. Each fragment of cDNA synthesis was done 
from viral RNA, extracted from patient serum, by 
Superscript III One-Step RT-PGR System with Platinum 
Taq DNA Polymerase (Invitrogen, GA) kit according to 
the manufacturer's protocol. Forward and reverse pri- 
mers were designed based on an alignment [10] (Glus- 
talW program available online from http://www.genome. 
jp/tools/clustalw/) of 217 genotype la sequences and 26 
genotype 2 sequences from the European HGV data- 
bases [11]. Each primer from the alignment was selected 
from regions displaying 40-70% consensus within the 
respective alignment. RT-PGR amplification was per- 
formed by adding an entirely conserved forward primer 



from the recombinant virus and both la and 2b reverse 
primers from the alignment into the reaction (Table 1) 
to generate amplification products 1-4 (Figure 1). Frag- 
ments 5, 6, and 7 were done only with la primers. PGR 
fragments were restricted to be between 900 to 2000 
basepairs. Seven fragments were amplified from RNA 
covering the full polyprotein and the 5' UTR of the 
recombinant virus. After amplification the PGR products 
were gel purified and cloned into a TOPO TA cloning 
vector (Invitrogen, GA) according to the manufacturer's 
protocol. After identifying the positive clones the cDNA 
inserts were sequenced bidirectionally with Big Dye 
(Applied Biosystems) at the UW-Madison Biotechnology 
facility. 

Bioinformatic Analysis 

A web-based program http://www.phylogeny.fr/ was 
used to analyze the recombinant strain relative to other 
HGV strains [12]. All sequencing data were collected 
and searched against the NGBI database using a web- 
based nucleotide BLAST program http://blast.ncbi.nlm. 
nih.gov/Blast.cgi. For identification of a more specific 
junction, analyses were performed using the SimPlot 
program [(Version 3.5.1) [13] available online from 
http://sray.med.som.jhmi.edu/SGRoftware/simplot] . The 
RNA secondary structure was analyzed using the 
MFOLD program with default parameters. The MFOLD 
web based program is provided by Michael Zuker, 
Rensselaer Polytechnic Institute http://mfold.rna.albany. 
edu/?q=mfold/RNA-Folding-Form. The stable RNA 



Table 1 Primers used for Amplification and DNA Sequencing of 5' UTR and NS5B regions in the identification of the 
HCV recombinant isolate (A, B) and subsequently for Amplification of Overlapping Products Across Entire HCV 
Recombinant Genome (1-7) 



Amplicon Nucleotide 
Number 
(H77 
numbering 
Accession 

no. 
AF009606) 



Forward Primer (5'-3') 
Normal: HCV genotyping primer; Bold: la primer; 
Italics: 2b primer 



Reverse Primer (5'-3') 
Normal: HCV genotyping primer; Bold: la primer;/ta//cs; 
2b primer 



A 73-3 1 3 AAGCGTCTAGCC ATGGCGT [7] 

B 8256-8644 TATGAYACCCGCTG^^GACTC [8] 

1 92-1295 TAGTATGAGTGTCGTACAGCCTCC 

2 1171 -2240 TGGGAGACGTGTGTGGGGCCGTGATGATCGTGTCGCAGGC 

3 21 08-2980 TTTTAGGAAGCACCCAGATACCACCTATCTTAAGTGTGGAGC 

4 2901-4793 TTAGTCTTGGCCGAGGCCCAGATTCAGCAATGGG 

5 4707-5789 ATAGACTGTAACACGTGTGTCACCC 

6 5758-7285 CCACTGGCCAAACCCTCCTCTTCAACATA 

7 72 1 8-9296 GCGCGGCCGG ACTACAACCCCCCGCTG ATAG AGGCGTGG 



CACTCGCAAGCACCCTATCA [6] 

GCNGARTAYCWGTCATAGCCTC [8] 

TAGGGGACCAGTTCATCATCATATCCCATGCC 

ATGACTTTGGCCCACGCTCCCTGCATGG 

TTGGTATATCCAGTTGAGTTCATCCAGGTGCAACCGAACC 

TGCGGACAATCGATGCTCCACCCCCCCTACATACATCCG 

TTGCGCTTCTGCTCTAGTCAGAAAATACTGAAGCCACC 

TGACCTCAAACACAAGGCGTGGGTGTAGAATGACAGCC 

TGGTTGTCTCAATGGTGAAGGTAGGGTCAAGGC 

TTCTCCCTCTACGTTGACTACGGGAGACAGC 

AGGACCTTCCCCAGTCCAACAC 

ACCACAGGTGGTTCGTAGTCGGGCTTTTTCCACGCC 

TTCATTATGAAAGGATCCGCGGGGTCGGGCACGAGACAGGC 
TGTG 
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Figure 1 Schematic of amplification products used to genotype discordant virus and to sequence recombinant virus. Regions of the 
HCV 5'UTR (A) and NS5B (B) were amplified on all patients at UWHC for a 12 month period of time and sequenced using primers in Table 1. 
Amplification products 1-7 were generated and sequenced separately using primers listed in Table 1. 



secondary structure for this chimeric virus was analyzed 
and compared with the stable hairpin structure 1 as 
described by [14]. The NS2-NS3 region near the cross- 
over point was also analyzed using the same program, 

GenBank nucleotide sequence accession number 

The entire nucleotide sequences of this recombinant 
virus have been submitted to GenBank and assigned 
accession number is JF779679. 

Results 

DNA Sequencing clinical isolates at two different regions 
of the HCV genome produces mostly concordant results 

Over a 12 month period, the Molecular Diagnostic lab 
at the University of Wisconsin Hospital and Clinics 
(UWHC) performed a bidirectional consensus sequen- 
cing reaction of 203 base pairs of the 5' UTR as well as 
a separate bidirectional consensus sequencing reaction 
of the 222 base pair fragment of NS5B for 133 consecu- 
tive patients. As expected, based on genotype 1 being 
the most common in the United States, almost 70% of 
the time the NS5B subtype was la (90 different 
patients). Therefore, if a patient in our population was 
exposed to multiple HCV strains both the first and the 
second exposure might be subtype la viruses. In all 
cases but one, the genotype determined by the 5' UTR 
matched that determined by the NS5B fragment (sub- 
type discrepancies between genotype 1 were unresolva- 
ble due to the lack of specificity of the 5 'UTR [15,16]. 
The only exception was an isolate with a 5' UTR that 
matched best with genotype 2b sequences (Figure 2a), 
while the corresponding NS5B sequence from that iso- 
late matched best with a genotype la sequence (Figure 
2b). Neither the 5' UTR 2b sequence nor the NS5B 
sequence perfectly matched with previously determined 



sequences in our patient population during this study 
period, although the 2b sequence was identical to an 
isolate from one patient prior to this twelve month per- 
iod of time which had been identified as a 2b 5' UTR 
(see Figure 2a). Sequencing of the isolate with the near 
identical 2b 5' UTR sequence by NS5B primers was con- 
sistent with a 2b genotype. Furthermore the uridine 
position at base pair 204 of the 5' UTR (numbering 
according to H77 GenBank # AF009606) from this iso- 
late was found to be a mixed population containing 
both uridine as well as the cytidine more commonly 
seen among isolates characterized as 2b at our 
institution. 

The discordant clinical isolate is a 2b/1a recombinant 
virus with a crossover at the NS2/NS3 Junction 

Since isolates from four separate blood draws of a single 
patient over 5 months yielded a discordant 2b and la 
genotype from DNA sequencing of the 5' UTR and 
NS5B regions, respectively, both subtype specific 2b and 
la primers were designed and used to amplify from base 
pair 73-313 and 8256-8644 (Table 1, Figure 1) across 
the HCV genome. Strong amplification was seen with 
2b primers throughout the structural regions, but not 
downstream of NS2. Conversely, la primers either failed 
or weakly amplified the structural gene region upstream 
of NS2, but strongly amplified sequences encoding the 
nonstructural proteins. The entire reconstructed genome 
was submitted to GenBank (accession # JF779679 here- 
after denoted RF8_2/la). Previously, we and others have 
noted that subtype la has two distinct clades provision- 
ally termed lal and la2 [17]. Both clades are widely dis- 
persed temporally and across the globe with the 
prototypical H77 strain being representative of the lal 
clade. Phylogenetic analysis of just the NS3-5B region of 
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170 180 190 200 210 220 230 240 250 260 270 

2b/ la Recomb 5'UTR GGAADUACCG GAAAGACUGG GUCCDUUCUU GGACAAACCC ACUCUAUGUC CGGUCAUUUG GGCGUGCCCC CGCAAGACUG CUAGCCGAGU AGCGUUGGGU UGCGAAAGGC # of sequences 

isolatefl 5'XJTR I Y 

isolate#2 5'UTR G U 3 

isolate#3 5'UTR U. .C A 

isolate#4 5'UTR U. .C U C. . . 

isolate#5 5'UTR U U C. . . 4 

isolate#6 5'UTR G U U C. . . 2 

isolate#7 5'UTR U A U C. . . 

la H77 AF009606 5'UTR G. .A .G.C. . .C U G...A. ..C. U. .AG U C 



b 



NS5B Aa # 291 - -- -- -- - 300 - -- -- -- -- 310 - -- -- -- -- 320 ------ 327 

NTLTCYIKARAACRAAGLRDCTMLVCGDDLVVICESA « 
2b/ la Recomb AAC ACC CDC ACC DGC UAC AUC AAG GCC CGA GCA GCC DGU CGA GCC GCA GGG CUC CGG GAC DGC ACC AUG CUC GUG UGU GGC GAC GAC COG GUC GUD AUG OGU GAG AGU GCG 

isola'te A A A A 

isolate B U G A U.A A 3 

isolate C C A A. A U.A A 3 

2b AY232740 A.G ..A A ... .UU G ... AA. . .U A.U GU. ... CC. .UU G A C ... .CA C CAA 

2b NS5B AAM LKIVPI SQ 



Figure 2 Fewer unique sequences of HCV subtype 2b (Panel A) than la (Panel B) were identified at UWHC. A) DNA sequence of the 5' 
UTR of the discordant 2b/la strain is aligned with several representative non-discordant 2b strains, as well as the 5' UTR of the prototypical la 
H77 strain. These 8 sequences represent all the sequence diversity of this region of the 14 2b subtypes found in the 12 months in which both 
the 5' UTR and NS5B regions were sequenced. B) Alignment of the NS5B protein and nucleotide sequence of the 2b/la discordant strain with 
three typical la strains as well as a 2b reference strain. NS5B DNA sequence of isolate A is quite close to the 2b/la strain, but the NS5B DNA 
sequences of isolates B and C are less similar, yet more typical of la strains from UWHC, with 3 identical sequences from the 90 subtype la 
sequences determined in the 12 month period. 



this recombinant strain shows it partitions with strains 
in the la2 clade (Figure 3A) rather than strains in the 
lal clade including H77. A representative 2b genome 
(AY232740), the H77 genome (AF009606), as well as a 
la2 genome (EU255981) were all used as reference 
sequences to compare with the recombinant sequence 
via Simplot. As shown in Figure 3B, the identity of this 
strain is over 95% with a 2b sequence (AY232740) until 
the end of the NS2 gene, at which point it drops off 
below 82%, but climbs to over 95% identity when com- 
pared to a genotype la strain (EU255981). Most break- 
through viremia in patients on protease inhibitors that 
has been characterized to date has occurred in subtype 
la infected patients [18]. This is consistent with a codon 
bias between lb and la isolates in which most lb 
patients require two mutations to occur in the codon 
for residue 155 of NS3, whereas most la patients 
require only a single base pair change to encode for 
lysine rather than arginine. Accordingly, the NS3 
sequence from this recombinant isolate also requires 
only a single base pair change from UUG to encode a 
lysine and subsequently confer protease resistance. 

While HCV recombinants between two different geno- 
types are rare, closely related putative parental strains 
and several different isolates of the RF_2k/lb have been 
sequenced. Based on RNA structural analysis and 
recombinants seen in the plant virus turnip crinkle 
virus, mechanisms that might lead to recombination 



have been proposed including the identification of two 
stable RNA hairpins upstream and downstream of the 
crossover site in the parental strains that are putatively 
destabilized by mutations in the recombinant [14]. For 
example, the hairpin structure 1 (HSl) observed in the 
recombinant is present in the parental 2k strain, but is 
destabilized slightly by the acquisition of two mutations 
present in the recombinant RFl_2k/lb. Additionally, a 
hairpin downstream of the crossover site in the lb par- 
ental strain was destabilized by 5 mutations present in 
the RF_2k/lb recombinant. We examined if this hairpin 
was also present in the RF8_ 2/la reported here or in 
any of the other recombinants with a genotype 2 5' 
UTR-NS2 region. Although, all recombinants exhibited 
RNA base pairing in this region, all secondary structures 
appeared less stable than that of RF_2k/lb (data not 
shown), with two small stems being predicted rather 
than one longer stem. Non-recombinant 2b and the 
RF8_ 2/la have similarly stable hairpins in this region, 
however, neither are as stable as the 2k hairpin (recom- 
binant or parental). 

Discussion 

For over ten years, HCV genotyping has been the criti- 
cal parameter to determine both the likelihood of 
response to therapy, as well as the duration of therapy 
needed to obtain a Sustained Virologic Response (SVR) 
[19-21]. While several HCV genotyping methods exist. 
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Figure 3 The HCV recombinant 2b/1a isolate crosses over near the NS2/NS3 junction. A) Phylogeny of NS3-NS5B black lines represent 
accession numbers previously identified as clade lal, and red lines represent accession numbers identified as la2 [17]. B) Simplot of 2b/la 
recombinant referenced against a subtype la clade 1 (H77), a subtype la clade 2 strain and a subtype 2b reference strain. 



A 



none are FDA approved in the United States. These 
methods are based on primarily targeting only the 
5'structural regions of virus and thus cannot easily iden- 
tify recombinant strains. Based on the last Hepatitis 
Viral Load proficiency survey (HVL-C) administered by 
the College of American Pathologists in 2010 [22], the 
most commonly employed HCV genotyping method, 
utilized by over 60% (110/177) of participating clinical 
diagnostic labs in the United States, is the Versant HCV 
(LiPA) 2.0 assay (manufactured by Innogenetics, distrib- 
uted by Siemens Healthcare Diagnostics). While CE 



marked in Europe and approved for research use only in 
the United States, this assay is based on line probe 
hybridization targeting DNA sequence information 
within the 5' UTR and the contiguous structural core 
(C) region of HCV (Figure 1). While 9% (16/177) of par- 
ticipants did conduct genotyping by DNA sequencing 
using the Siemens Diagnostics TruGene system, sequen- 
cing information is again only obtained from the 5' UTR 
of HCV. These 5' structural regions have been utilized 
historically for genotyping because they are adequately 
conserved such that a limited number of primers or 
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probes can amplify and recognize all isolates, respec- 
tively, but have sufficient diversity to distinguish 
between non-recombinant genotypes 1-6. However, it 
should be noted that information provided solely by the 
5'UTR is insufficient for subtype identification and in 
some cases for genotype identification. 

Several studies looking for HCV recombinants in 
intravenous drug users and other populations where 
multiple exposures are likely have been performed and 
in general recombination does seem to be a rare event 
[23]. While other reports of recombination between dif- 
ferent genotypes exist [24], the DNA sequence of entire 
recombinant genomes and site of recombination 
remains to be determined. Interestingly, all full-length 
recombinants described to date between two genotypes 
have included a 5' portion of genotype 2 [25-29]. So far 
only RFl_2k/lb has been shown to be circulating in 
multiple patients described in Russia [26], Ireland [30], 
Estonia [31] and Uzbekistan [32]. As shown in Figure 4, 
all reported HCV recombinants have similar, but non- 
identical cross over points to the RF8_2/la (reported 
here) or RFl_2k/lb. 

It remains unclear how much of the genome needs to 
be genotype 2 in order for the clinical response to justif)^ 
a 12-24 week treatment course rather than 48 weeks 
advocated for genotype 1. Both within genotype 1, as 
well as between genotypes 1 and 2, there are known dif- 
ferences between the ability of NS5A to bind the ds 
RNA induced PKR [33]. These differences in NS5A 
binding alter the cellular interferon mediated antiviral 
response that in turn has been postulated to explain the 
corresponding clinical response. Clinical response to 
interferon-based regimens depends upon both viral 



factors (including NS5A and E2 glycoprotein) as well as 
host genetic factors, including lambda interferon poly- 
morphisms [34], but the viral genotype assigned by clin- 
ical labs should closely reflect related strains and ideally 
indicate the historical antiviral response for those 
strains. Data from a chimeric mouse model, as well as 
anecdotal clinical data, suggests the RFl_2k/lb strain is 
more resistant to interferon than some genotype 2 
strains [32]. As protease inhibitors and other directly 
acting antivirals become available, it will become 
increasingly important to know the genotype of each 
viral drug target of the isolate infecting the patient in 
order to determine the most effective therapy for that 
patient, and minimize the side-effects of therapy. Data 
from the PROVE 3 Protease Inhibitor trial [35], among 
others [18], suggests that subtyping may be clinically 
useful. Unfortunately, current methods for HCV geno- 
typing primarily solely targeting the 5'UTR and possibly 
contiguous core (C) structural regions do not provide 
sufficient information across the entire genome to detect 
the possibility of recombinant species which may be cri- 
tical for the determination for treatment efficacy. 

In conclusion, we report here the first naturally occur- 
ring HCV recombinant in the United States. While 
clearly an independent event from other recombinants, 
this strain shares several characteristics with those pre- 
viously reported in that it has genotype 2 5' UTR and 
structural genes, and a crossover point near the NS2/3 
junction. At this time we cannot tell whether this 
recombinant strain is circulating in patients besides the 
one reported here, but the patient was viremic from this 
strain for months and likely years. Hybridization probe 
techniques and DNA sequencing targeting only the 5' 
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GenBank Accession Number 
[Reference] 

JF779679 [reported in this study] 

DQ155560 [29] 

AM408911 [28] 

EU643835 [27] 

DQ364460 [25] 

AY587845 [26] 



Figure 4 Comparison of genotypic HCV recombinants previously reported. All full-length recombinants described with a single 
recombination point include a genotype 2 5' UTR. Only the 2k/lb recombinant has been identified in multiple patients. 



Bhattacharya et al. Virology Journal 201 1, 8:458 
http://www.virologyj.eom/content/8/1/458 



Page 7 of 8 



UTR/core regions are frequently used to clinically geno- 
type HCV to determine the dose and duration of ther- 
apy. One advantage of using direct DNA sequencing to 
genotype viruses is that the DNA sequence of amplified 
regions can be aligned with known recombinants, such 
as the strain reported here, particularly if multiple 
regions are sequenced. Using this approach, undiscov- 
ered recombinants may still be missed depending on the 
regions amplified, but at least an assessment of whether 
further testing is needed to rule out known recombi- 
nants can be made. The presence of circulating recom- 
binants of HCV may have significant ramifications for 
the efficacy and selection of therapy. Clearly more com- 
prehensive HCV genotyping is required to ascertain the 
significance of HCV recombinant isolates in clinical 
practice. 
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